168 research outputs found

    Improving RNN-Transducers with Acoustic LookAhead

    RNN-Transducers (RNN-Ts) have gained widespread acceptance as an end-to-end model for speech-to-text conversion because of their high accuracy and streaming capabilities. A typical RNN-T independently encodes the input audio and the text context, and combines the two encodings through a thin joint network. While this architecture provides SOTA streaming accuracy, it also makes the model vulnerable to strong LM biasing, which manifests as multi-step hallucination of text without acoustic evidence. In this paper we propose LookAhead, which makes text representations more acoustically grounded by looking ahead into the future within the audio input. This technique yields a significant 5%-20% relative reduction in word error rate on both in-domain and out-of-domain evaluation sets.
    Comment: 5 pages, 1 fig, 7 tables, Proceedings of Interspeech 202

    Intelligent Self-Repairable Web Wrappers

    The amount of information available on the Web grows at an incredibly high rate. Systems and procedures devised to extract these data from Web sources already exist, and different approaches and techniques have been investigated in recent years. On the one hand, reliable solutions should provide robust Web data mining algorithms that can automatically cope with possible malfunctions or failures. On the other hand, the literature lacks solutions for the maintenance of these systems. Procedures that extract Web data may be tightly coupled to the structure of the data source itself; thus, malfunctions or the acquisition of corrupted data could be caused, for example, by structural modifications that owners make to their data sources. Nowadays, verification of data integrity and maintenance are mostly managed manually, in order to ensure that these systems work correctly and reliably. In this paper we propose a novel approach to create procedures able to extract data from Web sources (the so-called Web wrappers) which can cope with malfunctions caused by modifications of the structure of the data source, and can automatically repair themselves.

    Predicting your next OLAP query based on recent analytical sessions

    In Business Intelligence systems, users interact with data warehouses by formulating OLAP queries aimed at exploring multidimensional data cubes. Being able to predict the most likely next queries would provide a way to recommend interesting queries to users on the one hand, and could improve the efficiency of OLAP sessions on the other. In particular, query recommendation would proactively guide users in data exploration and improve the quality of their interactive experience. In this paper, we propose a framework to predict the most likely next query and recommend it to the user. Our framework relies on a probabilistic user behavior model built by analyzing previous OLAP sessions and exploiting a query similarity metric. To gain insight into the recommendation precision and the parameters on which it depends, we evaluate our approach using different quality assessments.
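
The idea of predicting the next query from past sessions can be illustrated with a minimal sketch. It assumes a plain first-order Markov model over opaque query labels, which is a simplification chosen for illustration and not the framework proposed in the paper; in particular, it omits the query similarity metric. The function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


def build_transition_counts(sessions):
    """Count how often each query is immediately followed by another.

    `sessions` is a list of sessions, each a list of query labels in the
    order they were issued (hypothetical data format).
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, current_query):
    """Return (most likely next query, estimated probability), or None
    if the current query was never seen with a successor."""
    followers = counts.get(current_query)
    if not followers:
        return None
    total = sum(followers.values())
    query, n = followers.most_common(1)[0]
    return query, n / total


# Hypothetical past sessions, used only to exercise the sketch.
sessions = [
    ["sales_by_year", "sales_by_quarter", "sales_by_month"],
    ["sales_by_year", "sales_by_quarter", "sales_by_region"],
    ["sales_by_year", "sales_by_region"],
]
counts = build_transition_counts(sessions)
prediction = predict_next(counts, "sales_by_year")
print(prediction)
```

Here "sales_by_quarter" follows "sales_by_year" in two of the three sessions, so it is recommended with an estimated probability of 2/3. A similarity metric, as described in the abstract, would additionally let the model generalize to queries that do not match a past query exactly.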

    Database Learning: Toward a Database that Becomes Smarter Every Time

    In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea, learning from past query answers, Database Learning. We exploit the principle of maximum entropy to produce answers, which are in expectation guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems.
    Comment: This manuscript is an extended report of the work published in ACM SIGMOD conference 201

    Distributed Caching for Processing Raw Arrays

    As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate this problem through a series of techniques. In-situ mechanisms provide direct access to raw data in the original format, without loading and partitioning. Parallel processing scales to the largest datasets. In-memory caching reduces latency when the same data are accessed across a workload of queries. However, we are not aware of any work on distributed caching of multi-dimensional raw arrays. In this paper, we introduce a distributed framework for cost-based caching of multi-dimensional arrays in native format. Given a set of files that contain portions of an array and an online query workload, the framework computes an effective caching plan in two stages. In the first stage, the plan identifies the cells to be cached locally from each of the input files by continuously refining an evolving R-tree index. In the second stage, it determines an optimal assignment of cells to nodes that collocates dependent cells in order to minimize the overall data transfer. We design cache eviction and placement heuristic algorithms that consider the historical query workload. A thorough experimental evaluation over two real datasets in three file formats confirms that the proposed framework outperforms existing techniques, by as much as two orders of magnitude, in terms of cache overhead and workload execution time.

    Genotype x environment interaction and stability of indigenous coriander (Coriandrum sativum L.) genotypes for seed yield in different agro-climatic zones of Chhattisgarh

    The present study was conducted to assess the stability and yield performance of 13 genotypes of indigenous coriander (Coriandrum sativum L.) evaluated in different agro-climatic zones of Chhattisgarh. The trials were laid out in a Randomized Block Design (RBD) with three replications at three locations for three years, resulting in nine environments (Genotype × year interactions). The genotypes and G × E interactions revealed significant differences at p < 0.01 for seed yield, indicating that varieties and testing environments were distinct from each other. Additive main effects and multiplicative interaction analysis (AMMI-biplot) indicated that the yield performance of indigenous coriander genotypes was highly affected by the environments. The first two principal component axes (PCA 1 and PCA 2) were significant and together explained 67% of the total genotype x environment interaction, of which 42.4% and 24.6% were represented by PCA 1 and PCA 2, respectively. A biplot generated using genotypic and environmental scores of the first two AMMI components demonstrated that genotypes with larger PCA 1 and lower PCA 2 scores were high-yielding and stable, and genotypes with lower PCA 1 and larger PCA 2 scores were low-yielding and unstable in the tested locations. The genotype GC 5 C-101 (ICS 4) showed a higher grain yield (16.35 q ha⁻¹) than the grand mean (13.03 q ha⁻¹) and also had the minimum PCA 1 score, minimum AMMI stability value (ASV) and yield stability index (YSI). Therefore, genotype ICS 4 (Chhattisgarh Shri Chandrahasini Dhaniya -2) showed wider stability across the different agro-climatic environments of Chhattisgarh.
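
For readers unfamiliar with the AMMI stability value mentioned above, a commonly used definition is given here as a sketch. The abstract does not state which formula the authors applied, so this is an assumption; the symbols are introduced here for illustration, with SS denoting the interaction sum of squares explained by each axis and PCA 1, PCA 2 denoting a genotype's scores on the first two axes.

```latex
\mathrm{ASV} = \sqrt{\left[\frac{SS_{\mathrm{PCA\,1}}}{SS_{\mathrm{PCA\,2}}}\,\bigl(\mathrm{PCA\,1\ score}\bigr)\right]^{2} + \bigl(\mathrm{PCA\,2\ score}\bigr)^{2}}
```

Under this definition, a smaller ASV means the genotype lies closer to the biplot origin and is therefore considered more stable across environments.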